Search | WHO COVID-19 Research Database

Overview of the CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News (preprint)

Preslav Nakov; Giovanni Da San Martino; Tamer Elsayed; Alberto Barrón-Cedeño; Rubén Míguez; Shaden Shaar; Firoj Alam; Fatima Haouari; Maram Hasanain; Watheq Mansour; Bayan Hamdan; Zien Sheikh Ali; Nikolay Babulkov; Alex Nikolov; Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl; Mucahid Kutlu; Yavuz Selim Kartal.

arxiv; 2021.

Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2109.12987v1

ABSTRACT

We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting tasks related to factuality, and covers Arabic, Bulgarian, English, Spanish, and Turkish. Task 1 asks participants to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics (in all five languages). Task 2 asks them to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims (in Arabic and English). Task 3 asks them to predict the veracity of a news article and its topical domain (in English). The evaluation is based on mean average precision or precision at rank k for the ranking tasks, and macro-F1 for the classification tasks. This was the most popular CLEF-2021 lab in terms of team registrations: 132 teams. About one-third of them participated: 15, 5, and 25 teams submitted official runs for tasks 1, 2, and 3, respectively.
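The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the lab's official scorer: the function names are hypothetical, relevance labels are assumed to be binary (1 = relevant, 0 = not relevant), and average precision is normalized by the number of relevant items present in the ranking, which assumes every relevant item was retrieved.

```python
from typing import Hashable, Sequence


def precision_at_k(ranked_labels: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (label 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of precision@i over the ranks i that hold a relevant item.

    Assumption: all relevant items appear somewhere in the ranking.
    """
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Average of average_precision over several ranked lists (queries)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def macro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro-F1 weights every class equally regardless of its frequency, which matters when class sizes are imbalanced; the ranking measures instead reward placing relevant items near the top of the list.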

Subject(s)

COVID-19

ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection (preprint)

Fatima Haouari; Maram Hasanain; Reem Suwaileh; Tamer Elsayed.

arxiv; 2020.

Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2010.08768v2

ABSTRACT

In this paper, we introduce ArCOV19-Rumors, an Arabic COVID-19 Twitter dataset for misinformation detection composed of tweets containing claims from 27th January till the end of April 2020. We collected 138 verified claims, mostly from popular fact-checking websites, and identified 9.4K tweets relevant to those claims. Tweets were manually annotated by veracity to support research on misinformation detection, which is one of the major problems faced during a pandemic. ArCOV19-Rumors supports two levels of misinformation detection over Twitter: verifying free-text claims (called claim-level verification) and verifying claims expressed in tweets (called tweet-level verification). In addition to health, our dataset covers claims related to other topical categories that were influenced by COVID-19, namely social, politics, sports, entertainment, and religion. Moreover, we present benchmarking results for tweet-level verification on the dataset. We experimented with SOTA models from a range of approaches that exploit content, user profile features, temporal features, or the propagation structure of the conversational threads for tweet verification.

Subject(s)

COVID-19

ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks (preprint)

Fatima Haouari; Maram Hasanain; Reem Suwaileh; Tamer Elsayed.

arxiv; 2020.

Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2004.05861v3

ABSTRACT

In this paper, we present ArCOV-19, an Arabic COVID-19 Twitter dataset that covers the period from 27th of January till 30th of April 2020. ArCOV-19 is the first publicly available Arabic Twitter dataset covering the COVID-19 pandemic; it includes over 1M tweets alongside the propagation networks of the most popular subset of them (i.e., the most retweeted and most liked). The propagation networks include both retweets and conversational threads (i.e., threads of replies). ArCOV-19 is designed to enable research in several domains, including natural language processing, information retrieval, and social computing, among others. Preliminary analysis shows that ArCOV-19 captures rising discussions associated with the first reported cases of the disease as they appeared in the Arab world. In addition to the source tweets and the propagation networks, we also release the search queries and the language-independent crawler used to collect the tweets to encourage the curation of similar datasets.

Subject(s)

COVID-19
